Audio/Visual Independent Components

Authors

  • Paris Smaragdis
  • Michael Casey
Abstract

This paper presents a methodology for extracting meaningful audio/visual features from video streams. We propose a statistical method that does not distinguish between auditory and visual data but instead operates on a fused data set. In doing so, we discover audio/visual features that correspond to events depicted in the stream. Using these features, we can segment the input video stream by separating independent auditory and visual events.
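
The abstract does not spell out the algorithm, so the following is only a rough sketch of what "operating on a fused data set" could look like, not the authors' implementation. It assumes per-frame audio and video feature vectors are simply concatenated, reduced with PCA whitening, and then separated with a small symmetric FastICA loop. The function names (`fuse`, `whiten`, `fast_ica`), the feature sizes, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def fuse(audio, video):
    """Concatenate per-frame audio and video features (assumed fusion step)."""
    def standardize(x):
        x = x - x.mean(axis=0)
        return x / (x.std(axis=0) + 1e-12)
    # Standardize each feature so neither modality dominates by scale alone.
    return np.hstack([standardize(audio), standardize(video)])

def whiten(x, k):
    """PCA-reduce to k dimensions with identity covariance."""
    x = x - x.mean(axis=0)
    n = x.shape[0]
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    z = u[:, :k] * np.sqrt(n)                  # frames x k, whitened
    basis = vt[:k] * (s[:k, None] / np.sqrt(n))  # k x features, x ~ z @ basis
    return z, basis

def sym_orth(w):
    """Symmetric decorrelation: nearest orthogonal matrix to w."""
    u, _, vt = np.linalg.svd(w)
    return u @ vt

def fast_ica(z, n_iter=200, seed=0):
    """Symmetric FastICA with a tanh nonlinearity on whitened data z."""
    n, k = z.shape
    rng = np.random.default_rng(seed)
    w = sym_orth(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        y = z @ w.T
        g = np.tanh(y)
        g_prime = 1.0 - g ** 2
        # Row i: E[z * g(w_i . z)] - E[g'(w_i . z)] * w_i
        w = sym_orth((g.T @ z) / n - g_prime.mean(axis=0)[:, None] * w)
    return w

# Synthetic stand-in for a video: two independent "events", each of which
# drives both the audio features and the video features.
rng = np.random.default_rng(0)
n_frames = 2000
t = np.arange(n_frames)
events = np.column_stack([
    np.sign(np.sin(2 * np.pi * t / 97.0)),   # periodic on/off event
    rng.uniform(-1.0, 1.0, n_frames),        # irregular event
])
audio = events @ rng.standard_normal((2, 8)) + 0.01 * rng.standard_normal((n_frames, 8))
video = events @ rng.standard_normal((2, 12)) + 0.01 * rng.standard_normal((n_frames, 12))

fused = fuse(audio, video)
z, basis = whiten(fused, k=2)
w = fast_ica(z)
sources = z @ w.T               # per-frame activation of each component
component_basis = w @ basis     # each row is one joint audio/visual feature
audio_part, video_part = component_basis[:, :8], component_basis[:, 8:]
```

In this sketch each row of `component_basis` spans both modalities, so a single component carries an audio signature (`audio_part`) and a visual signature (`video_part`), while the matching column of `sources` indicates when that component is active in the stream.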

Similar resources

L-Ball: Designing A Novel Sports Electronic Audio Ball for Visual Impairment Student

Background. A field study found that the balls used by visually impaired students at school rely on the sound of a small bell inside the ball. However, the sound emitted by the ball is very limited: the ball sounds only when it is moved. This makes it difficult for students with visual impairments to find a missing ball that is not emitting a sound. Objectives. A ...

Audio-visual classification of Swedish phonemes for pronunciation training

We present a method for audio-visual classification of Swedish phonemes, to be used in computer-assisted pronunciation training. The probabilistic kernel-based method is applied to the audio signal and/or either a principal or an independent component (PCA or ICA) representation of the mouth region in video images. We investigate which representation (PCA or ICA) may be most suitable and t...

Efficient likelihood computation in multi-stream HMM based audio-visual speech recognition

Multi-stream hidden Markov models have recently been introduced in the field of automatic speech recognition as an alternative to single-stream modeling of sequences of speech informative features. In particular, they have been very successful in audio-visual speech recognition, where features extracted from video of the speaker’s lips are also available. However, in contrast to single-stream m...

Audio-visual phoneme classification for pronunciation training applications

We present a method for audio-visual classification of Swedish phonemes, to be used in computer-assisted pronunciation training. The probabilistic kernel-based method is applied to the audio signal and/or either a principal or an independent component (PCA or ICA) representation of the mouth region in video images. We investigate which representation (PCA or ICA) may be most suitable and t...

Comparing the Impact of Audio-Visual Input Enhancement on Collocation Learning in Traditional and Mobile Learning Contexts

This study investigated the impact of audio-visual input enhancement teaching techniques on improving English as a Foreign Language (EFL) learners' collocation learning as well as their accuracy concerning collocation use in narrative writing. In addition, it compared the impact and efficiency of audio-visual input enhancement in two learning contexts, namely traditional and mo...

Publication date: 2003